An Efficient Dimensionality Reduction Approach for Small-sample Size and High-dimensional Data Modeling

Authors

  • Xintao Qiu
  • Dongmei Fu
  • Zhenduo Fu
Abstract

As massive multidimensional data are generated in a wide range of emerging applications, this paper introduces two dimensionality reduction methods for processing and modeling small-sample-size, high-dimensional data. For feature selection, the support vector machine (SVM) is combined with recursive feature elimination (RFE) to form the SVM-RFE algorithm. For feature extraction, higher-order singular value decomposition (HOSVD) is applied after the data have been organized into a high-order tensor. Validation on simulation experiment data shows that the proposed feature selection and feature extraction methods can be applied effectively to analyzing and modeling atmospheric corrosion data. The feature selection method ensures that the remaining feature subset is optimal; the feature extraction method preserves the original structure, the discriminative information, and the integrity of the data. Finally, the paper proposes a complete dimensionality reduction solution for the high-dimensional, small-sample data problem, and the solution has been implemented in code.
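To make the feature extraction step more concrete, the following is a minimal NumPy sketch of a truncated HOSVD. It is not the authors' implementation: the function names (`unfold`, `mode_dot`, `hosvd`, `reconstruct`), the tensor shape, and the target ranks are all assumptions chosen for illustration, and the random tensor stands in for real data.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_dot(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode` (mode-n product)."""
    moved = np.moveaxis(tensor, mode, 0)
    out = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)


def hosvd(tensor, ranks):
    """Truncated HOSVD: one factor matrix per mode plus a core tensor.

    `ranks` gives the number of leading left singular vectors kept per mode
    (a hypothetical parameter for this sketch).
    """
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    # Project the tensor onto each factor subspace to obtain the core.
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_dot(core, u.T, mode)
    return core, factors


def reconstruct(core, factors):
    """Map the core tensor back to the original space."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_dot(out, u, mode)
    return out


# Illustrative data only: a random 6 x 5 x 4 tensor.
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 5, 4))

core, factors = hosvd(X, ranks=(3, 2, 2))
print(core.shape)  # reduced representation with shape (3, 2, 2)
```

Keeping all singular vectors in every mode reproduces the tensor exactly; smaller ranks give a compressed core whose entries can serve as extracted features while the multi-way layout of the data is retained.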

Similar resources

A Monte Carlo-Based Search Strategy for Dimensionality Reduction in Performance Tuning Parameters

Redundant and irrelevant features in high dimensional data increase the complexity in underlying mathematical models. It is necessary to conduct pre-processing steps that search for the most relevant features in order to reduce the dimensionality of the data. This study made use of a meta-heuristic search approach which uses lightweight random simulations to balance between the exploitation of ...

Robust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data

Background and purpose: By evolving science, knowledge, and technology, we deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with high-dimensional data are the estimation of the coefficients and interpretation. For high-dimension problems, classical methods are not reliable because of a large number of predictor variable...

Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach

Feature selection can significantly be decisive when analyzing high dimensional data, especially with a small number of samples. Feature extraction methods do not have decent performance in these conditions. With small sample sets and high dimensional data, exploring a large search space and learning from insufficient samples becomes extremely hard. As a result, neural networks and clustering a...

Kernel-based Fuzzy Feature Extraction Method and Its Application to Face Image Classification

The Hughes phenomenon (or the curse of dimensionality) shows two essential directions for improving the classification performance on high-dimensional and small sample size (SSS) problems. One is to reduce the dimensionality of applied data by feature extraction or feature selection methods. The other is to increase the training sample size. In recent years some kernel-based feature extraction ...

A Framework for Local Supervised Dimensionality Reduction of High Dimensional Data

High dimensional data presents a challenge to the classification problem because of the difficulty in modeling the precise relationship between the large number of feature variables and the class variable. In such cases, it may be desirable to reduce the information to a small number of dimensions in order to improve the accuracy and effectiveness of the classification process. While data reduc...

Journal:
  • JCP

Volume 9  Issue 

Pages  -

Publication date 2014